In computing mortality rates from insurance data, the unit of measurement used
is frequently based on number of policies or amount of insurance rather than on lives. Then
the death of one person may result in several units of "death" with respect to the
investigation; moreover, the number of units per individual may vary noticeably. Thus the usual
large sample methods of obtaining significance tests and confidence intervals for the true value
of the mortality rate are not applicable to these situations. If the number of units associated
with each person in the investigation were known, accurate large sample results could be
obtained; however, determination of the number of units associated with each individual would
require an extremely large amount of work. This article presents some valid large sample
tests and confidence intervals for the mortality rate which do not require much work and are
reasonably efficient. More general situations are also considered.

Introduction. Let us consider a large number n of sample values from a binomial
population for which q is the probability of a failure. If q' is the fraction of failures in this
sample, asymptotically (n → ∞) the distribution of

$$\sqrt{n}\,(q' - q)\Big/\sqrt{q'(1 - q')}$$

is normal with zero mean and unit variance. This quantity can be used to obtain large sample
confidence intervals and significance tests for q. In particular, these results can be used
to obtain large sample tests and confidence intervals for the rate of mortality when the
investigation is based on lives. Then n represents the number of individuals under
observation, q is the probability that an individual dies within the interval of time considered
(i.e., the rate of mortality for this interval), and q' equals the number of deaths during
the interval divided by n. Here the rate of mortality q can also be interpreted as the
expected value of the fraction of deaths among the people under observation during the
specified time interval.
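
As a minimal sketch of the lives-based case, the normal approximation above can be inverted to give a two-sided interval for q. The function name and the example figures (120 deaths among 10,000 lives) are illustrative assumptions and do not come from the paper.

```python
from math import sqrt
from statistics import NormalDist


def lives_interval(deaths, n, confidence=0.95):
    """Large-sample interval for q based on sqrt(n)(q' - q)/sqrt(q'(1 - q'))."""
    q_hat = deaths / n                              # q' in the text
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided normal quantile
    half_width = z * sqrt(q_hat * (1 - q_hat) / n)
    return q_hat - half_width, q_hat + half_width


# Hypothetical investigation based on lives.
low, high = lives_interval(deaths=120, n=10_000)
```

This interval is only appropriate when each individual contributes exactly one unit, which is the situation the rest of the paper moves away from.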

Now let us consider situations where the rate of mortality investigated is one of

(a). The expected value of the fraction of the total number of policies under
investigation which are paid within the interval of time considered.

(b). The expected value of the fraction of the total amount of insurance in force which
is paid during the specified interval of time.

In case (a) each insurance policy is a unit of the investigation while in case (b) the unit
is some specified amount of insurance (e.g., $100 worth). For both cases let n be the number
of individuals under observation, m_i the number of units associated with the i-th person
(i = 1, ..., n; Σ m_i = m). Then, if q is the probability of a person dying within the
specified interval of time, for both (a) and (b) the rate of mortality is given by

$$\frac{1}{m}\sum_{i=1}^{n} m_i\, q = q.$$

Thus the rates of mortality based on lives, policies, and amounts are the same for this
situation. In cases (a) and (b), the usual estimate for the corresponding mortality rate is the
number of units paid during the specified time interval divided by m; let us denote this
estimate by q''. The expected value of q'' is q while its variance equals q(1 - q) Σ m_i² / m².

Then, if max m_i does not become indefinitely large as m → ∞, it follows from the Central
Limit Theorem and the convergence theorem [1] that asymptotically (m → ∞) the distribution of

$$m\,(q'' - q)\Big/\sqrt{q''(1 - q'')\sum_{i=1}^{n} m_i^2} \tag{1}$$

is normal with zero mean and unit variance. Thus, if the m_i are known, valid large sample
tests and confidence intervals can be obtained for q when the investigation is based on
policies or amounts. However, the amount of work required to determine the m_i is usually so
prohibitive that use of (1) is out of the question.

To obtain an accurate expression for the variance of q'' it is necessary to have knowledge
of the values of the m_i. Attempts have been made to estimate this variance without such
knowledge but the resulting expressions are at best extremely rough approximations and usually
appreciably underestimate the true value. For example, in investigations based on policies it
is sometimes assumed that the variance of q'' is approximately equal to q''(1 - q'')/m. Then

$$\sqrt{m}\,(q'' - q)\Big/\sqrt{q''(1 - q'')}$$

is used under the assumption that its asymptotic distribution is normal with zero mean and unit
variance. This can lead to absurd results. As an example, let the average number of policies
associated with each person be at least two. Then the variance of this quantity is not unity
but is two or greater.
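
The claim above follows from the variance of q'' given earlier: the quantity standardized with q''(1 - q'')/m has asymptotic variance Σ m_i² / m, which is at least the average number of units per person. The following sketch computes that ratio; the function name and the two hypothetical portfolios are assumptions made for illustration.

```python
def variance_inflation(units_per_person):
    """Asymptotic variance of sqrt(m)(q'' - q)/sqrt(q(1 - q)): sum(m_i^2) / m."""
    m = sum(units_per_person)
    return sum(u * u for u in units_per_person) / m


# Hypothetical portfolios, both averaging two policies per person.
equal_holdings = [2] * 1000      # everyone holds exactly two policies
mixed_holdings = [1, 3] * 500    # half hold one policy, half hold three

ratio_equal = variance_inflation(equal_holdings)   # 2.0
ratio_mixed = variance_inflation(mixed_holdings)   # 2.5
```

Uneven holdings push the ratio above the average number of units per person, so treating the quantity as having unit variance understates the true variance by at least that factor.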

The purpose of this paper is to present some accurate large sample tests and confidence
intervals for the rate of mortality which are readily computed for the usual type of insurance
data. These results do not furnish as much "information" as the tests and intervals based on
(1) but ordinarily this loss of efficiency is greatly outweighed by the computational savings.
Compared to the results based on (1), the power efficiencies of the tests presented are in the
neighborhood of 70%. This means that the "information" obtained by applying these tests to all
the observations is approximately the same as the "information" obtained by applying the
corresponding tests based on (1) to only 70% as many observations. Since the amount of data is huge
for most insurance investigations, however, this "loss" of 30% of the observations is usually
not of great importance. An intuitive explanation of the meaning of power efficiency is given
in [2].

For many insurance investigations, the probability of death within the specified time
interval is not the same for all the individuals under observation. Instead, several different
classes of risks are combined and it is desired to find the rate of mortality for the combined
group. Then use of (1) is no longer appropriate. However, if certain uniformity conditions
hold with respect to the alphabetical distribution (last name) of the members of the different
classes of risks and with respect to the distribution of the units among the individuals, the
tests and intervals presented in this article are still applicable. It seems likely that
these uniformity conditions will be approximately satisfied for the usual type of situation
if the number of individuals in each class of risks is very large while the maximum number of
units per individual is very small compared to the total number of units under investigation.

An  extension  of  these  results  to  more  general  types  of  situations  is  presented  in  the 
Appendix. 

Outline of Method. First let us consider the case where the probability of death within
the specified time interval is the same for each person of the investigation; also these
individuals represent statistically independent observations. The problem is to obtain easily
applied tests and confidence intervals for the common probability of death (i.e., rate of
mortality) when the investigation is based on policies or amounts.

As the first step of the method, let the total number of units be divided into 26
subgroups on the basis of the first letter of the last name of the people insured. Since it is
not a common occurrence for the same person to have insurance under last names beginning with
different letters of the alphabet, these subgroups can be considered as approximately
statistically independent. Next, in some previously specified manner, combine some of these
subgroups until 10 to 15 subgroups containing approximately the same number of units are obtained.
These subgroups are also statistically independent. For each of these subgroups compute the
fraction consisting of the number of units paid during the specified time interval divided by
the number of units in the subgroup. Let q_1'', ..., q_r'' denote the resulting statistics. Then,
using the same argument as for (1), asymptotically (m → ∞) these fractions represent a set of
r independent observations from normal populations with common mean equal to q. Thus, if the
number of units investigated is very large, it is approximately true that q_1'', ..., q_r'' is a
set of independent observations from continuous symmetrical populations with common median
equal to the rate of mortality. Consequently the results of [3] and [4] are directly
applicable for finding confidence intervals and significance tests for q on the basis of
q_1'', ..., q_r''.
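
The paper leaves the rule for combining letters open ("in some previously specified manner"). One possible rule, sketched here purely as an assumption, walks through the alphabet in order and closes a subgroup whenever the running total of units reaches the next multiple of total/r. The function name and the unit counts are hypothetical.

```python
from string import ascii_uppercase


def combine_letters(units_by_letter, r):
    """Split the letters (in alphabetical order) into r consecutive subgroups
    holding roughly equal numbers of units. Assumes 1 <= r <= number of letters."""
    letters = sorted(units_by_letter)
    total = sum(units_by_letter.values())
    groups, current, running = [], [], 0
    for idx, letter in enumerate(letters):
        current.append(letter)
        running += units_by_letter[letter]
        letters_left = len(letters) - idx - 1
        groups_left = r - len(groups) - 1   # subgroups still to open after this one
        reached_target = running >= total * (len(groups) + 1) / r
        # Close the subgroup at its cumulative target, or when every remaining
        # letter is needed to fill one remaining subgroup each.
        if groups_left > 0 and (reached_target or letters_left == groups_left):
            groups.append(current)
            current = []
    groups.append(current)
    return groups


# Hypothetical unit counts: 100 units under each initial letter.
units = {letter: 100 for letter in ascii_uppercase}
subgroups = combine_letters(units, r=13)
```

The rule must be fixed before the data are examined, as the text requires; choosing the grouping after looking at the subgroup mortality rates would undermine the independence argument.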

Table 1 contains a list of some one-sided and symmetrical significance tests for comparing
q with a given hypothetical value q_0 for 10 ≤ r ≤ 15 (x_1, ..., x_r represent the values of
q_1'', ..., q_r'' arranged in increasing order of magnitude). Additional tests can be obtained from
[3, Table 1] and by use of the theory developed in [4]. The corresponding confidence intervals
and confidence coefficients can be obtained from these tests in the usual manner. The point to
be remembered in converting from these tests to the corresponding confidence intervals is that
the significance level of a test equals the probability of the relation defined by the test
holding when q_0 = q. For the tests considered here this automatically gives the probability
that a certain interval does not include the true value of q, whence the confidence coefficient
of that interval is determined. As an example, let us consider the case where r = 14. Then
the one-sided test

$$\text{Accept } q < q_0 \text{ if } \max\left[x_{10},\ \tfrac{1}{2}(x_6 + x_{14})\right] < q_0,$$

with 1% significance level yields the one-sided confidence interval

$$\left(-\infty,\ \max\left[x_{10},\ \tfrac{1}{2}(x_6 + x_{14})\right]\right)$$

with 99% confidence coefficient; i.e., the probability that the relation

$$-\infty < q < \max\left[x_{10},\ \tfrac{1}{2}(x_6 + x_{14})\right]$$

holds equals 99%. Similarly the corresponding symmetrical test yields the confidence
interval

$$\left(\min\left[x_5,\ \tfrac{1}{2}(x_1 + x_9)\right],\ \max\left[x_{10},\ \tfrac{1}{2}(x_6 + x_{14})\right]\right)$$

with 98% confidence coefficient.
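
A short sketch of the r = 14 symmetrical interval follows. The order-statistic indices (x_5, x_1, x_9 for the lower end; x_10, x_6, x_14 for the upper end) are a reading of the partly illegible scan and should be checked against [3, Table 1] before use. The function name and the subgroup rates are hypothetical.

```python
def symmetrical_interval_r14(rates):
    """Interval (min[x5, (x1 + x9)/2], max[x10, (x6 + x14)/2]) for r = 14.

    The indices are an assumed reading of the source; verify before relying on them.
    """
    if len(rates) != 14:
        raise ValueError("this interval is defined for exactly 14 subgroup rates")
    x = [None] + sorted(rates)          # pad so that x[1] is the smallest rate
    lower = min(x[5], (x[1] + x[9]) / 2)
    upper = max(x[10], (x[6] + x[14]) / 2)
    return lower, upper


# Hypothetical subgroup mortality rates, one of them unusually high.
example_rates = [0.030] + [i / 1000 for i in range(13, 0, -1)]
lower, upper = symmetrical_interval_r14(example_rates)
```

With one outlying subgroup the upper end is set by the half-sum rather than by x_10, which shows how the half-sum term makes the interval respond to the spread of the extreme subgroups.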

One application of confidence intervals for the rate of mortality occurs when graphical
interpolation is used in the construction of a mortality table. Here the true value of the
rate of mortality varies with age. For each age it is useful to have confidence intervals
for the corresponding rate of mortality. The confidence intervals employed are usually
symmetrical and have confidence coefficients which vary from 50% to around 90%. The
procedure used in graphical graduation is to choose one or more confidence coefficient values
and then obtain confidence intervals with these confidence coefficients for each age (one
confidence interval for each confidence coefficient). These confidence intervals are then
plotted on an age versus mortality rate graph (along with the mortality rate estimates).
The person performing the graphical graduation is guided by these confidence intervals when
drawing the curve which is to represent the graduated mortality rates. He attempts to draw
this curve so that for each set of confidence intervals with a common coefficient the
percentage of ages where the curve lies within the confidence intervals is approximately
equal to the value of the confidence coefficient. For example, consider the case where the
confidence coefficient values chosen are 60% and 80%. Then for each age there is a
confidence interval with coefficient 60% and a confidence interval with coefficient 80%. The
graduator would attempt to draw the curve so that it lies within the confidence intervals
with coefficient 60% for about 60% of the ages and within the confidence intervals with
coefficient 80% for about 80% of the ages.
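
The graduator's check described above amounts to counting the ages at which the drawn curve falls inside the interval. A minimal sketch, with a hypothetical function name and made-up values for five ages:

```python
def coverage_fraction(curve, intervals):
    """Fraction of ages at which the graduated rate lies inside its interval."""
    inside = sum(1 for q, (low, high) in zip(curve, intervals) if low <= q <= high)
    return inside / len(curve)


# Hypothetical graduated rates and 60% intervals for five consecutive ages.
graduated = [0.01, 0.02, 0.03, 0.04, 0.05]
intervals_60 = [(0.00, 0.02), (0.00, 0.01), (0.02, 0.04), (0.05, 0.06), (0.04, 0.06)]
fraction_60 = coverage_fraction(graduated, intervals_60)   # compare with 0.60
```

The same function would be applied once per confidence coefficient, and the curve adjusted until each fraction is close to its coefficient.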

Easily computed symmetrical confidence intervals for the rate of mortality with
confidence coefficients in the range 50% to 90% can be obtained by applying the results of
[3] and [4] to q_1'', ..., q_r''. Table 2 contains a list of some symmetrical confidence intervals for
10 ≤ r ≤ 15. By applying the method outlined at the beginning of this section separately to
each age, sets of confidence intervals suitable for use with respect to graphical graduation
are readily obtained.

Now let us consider the case where the probability of death within the specified time
interval is not the same for each person of the investigation. Then the method of obtaining
large sample tests and confidence intervals for the rate of mortality based on (1) is not
necessarily applicable. The method presented in this section is valid, however, if the
alphabetical distribution of the units and of the different types of risks is such that
q_1'', ..., q_r'' have the same expected value. This result is a consequence of the material
presented in the Appendix. If the number of units per individual is very small
compared to the total number of units in the investigation and the variation in the
probability of death is not great for the individuals considered, it appears likely that this
relation will be approximately satisfied for the usual type of investigation involving an
extremely large number of units. This contention is based on the intuitive observation that
on the average the probability of death does not depend on the first letter of the last name
of the person considered.

The essential property of the method outlined is the division of the observations into
statistically independent subgroups such that the expected values of the observed mortality
rates for those subgroups have a common value. Any method which has this property and
satisfies the asymptotic normality requirement is eligible for use. The alphabetical method
presented was chosen because it appeared to have computational advantages.

Efficiency Investigation. The tests and confidence intervals presented in the preceding
section are recommended by their ease of computation and generality of application. These
favorable points would be of little value, however, if these tests and confidence intervals
should happen to be extremely inefficient. The purpose of this section is to find power
efficiencies for the tests of Table 1 for the special case where each of the r subgroups
contains the same number of units and q_1'', ..., q_r'' have the same variance as well as the same
expected value. Here it is not assumed that the probability of death is the same for each
person of the investigation.

Let q be the true value of the rate of mortality (i.e., the expected fraction of the
total number of units under investigation which are paid during the specified time interval)
and σ² the common variance of q_1'', ..., q_r''. Under very general conditions (see Appendix),
asymptotically q_1'', ..., q_r'' represent the values of a random sample from a normal population
with mean q and variance σ². Also for a rather wide class of situations (1/r) Σ q_i'' is an
efficient estimate of q (see [?] for the definition of efficient estimate). The class of
situations where these normality and efficiency conditions are approximately satisfied would seem
to include most insurance investigations based on a large number of units. For example,
these conditions hold if the total group of individuals can be subdivided into a finite and
fixed number of classes such that the probability of death is the same within classes (but
different for different classes) while asymptotically the number of units in each class
becomes indefinitely large and the maximum number of units per person is bounded. This result
follows from the application of maximum likelihood theory to this situation. In the remainder
of this section it will be assumed that asymptotically q_1'', ..., q_r'' are a sample from a normal
population and that (1/r) Σ q_i'' is an efficient estimate of q.

For the situations considered, the asymptotic distribution of the quantity

$$\sqrt{r}\left(\frac{1}{r}\sum_{i=1}^{r} q''_i - q\right)\Big/\sigma \tag{2}$$

is normal with zero mean and unit variance. If σ² could be assumed known, the most powerful
one-sided and symmetrical tests for comparing q with a given value q_0 formed on the basis of
(2) would be at least as powerful as the corresponding most powerful tests for the case where
σ² is unknown. This follows from the fact that (1/r) Σ q_i'' is an efficient estimate of q. Thus
power efficiencies computed using (2) to furnish the most powerful tests (σ² known) will
be less than or equal to the power efficiencies based on the most powerful tests for σ²
unknown; i.e., conservative values will be obtained for the power efficiencies. In the
following analysis the most powerful tests used as a basis for finding power efficiencies will
be those obtained by using (2) under the assumption of known σ².

The method of defining power efficiency given in [4] and intuitively explained in [2]
will be used here. Essentially this amounts to adjusting the sample size for the
corresponding most powerful test (same significance level) until its power function is approximately
equal to the power function for the given test. The resulting sample size for the most
powerful test divided by the sample size for the given test is called the power efficiency
of the given test. As pointed out in [4], for the tests of Table 1 it is sufficient to
investigate one-sided tests of q < q_0. The power efficiency of the corresponding one-sided
test of q > q_0 (same significance level) has the same value as the power efficiency as a
test of q < q_0; also the symmetrical test of q ≠ q_0 based on these two one-sided tests has the
same power efficiency as the one-sided tests.

The power function of the most powerful test of q < q_0 based on s sample values
y_1, ..., y_s from a normal population with unknown mean q and known variance σ² equals

$$\Pr\left[\sqrt{s}\,(\bar{y} - q_0)/\sigma < -K_\alpha\right] = \Pr\left[\sqrt{s}\,(\bar{y} - q)/\sigma < -K_\alpha + \sqrt{s}\,(q_0 - q)/\sigma\right] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-K_\alpha + \sqrt{s}\,\delta} e^{-x^2/2}\,dx, \tag{3}$$

where α is the significance level of the test, δ = (q_0 - q)/σ, and K_α is defined by the
relation

$$\frac{1}{\sqrt{2\pi}}\int_{K_\alpha}^{\infty} e^{-x^2/2}\,dx = \alpha.$$

For the situations considered, this expresses the power function of the most powerful test of
q < q_0 at significance level α as a function of the parameter δ. Thus, given a one-sided
test of q < q_0 at significance level α from Table 1, the problem is to determine the value of
s so that the power function (3) is approximately the same as the power function of the given
test (both power functions expressed as functions of the parameter δ). Division of the
resulting value of s by the value of r yields the power efficiency of the Table 1 test
considered. Here s is allowed to assume non-integral values (see [2] for interpretation of
non-integral sample sizes).
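
The following sketch evaluates the power function (3) and backs out a matching sample size s. It matches the two power functions at a single value of δ, which is a simplification of the paper's procedure of matching them approximately over the whole range of δ. The function names and the example figures (s = 9.8, r = 14) are illustrative assumptions and are not values from Table 3.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_known_variance(delta, s, alpha):
    """Power function (3): Phi(-K_alpha + sqrt(s) * delta), with s possibly fractional."""
    k_alpha = _STD_NORMAL.inv_cdf(1 - alpha)   # upper-tail area above K_alpha equals alpha
    return _STD_NORMAL.cdf(-k_alpha + sqrt(s) * delta)


def matching_sample_size(delta, target_power, alpha):
    """Value of s at which (3) equals target_power for one given delta (delta != 0)."""
    k_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    return ((k_alpha + _STD_NORMAL.inv_cdf(target_power)) / delta) ** 2


def power_efficiency(s, r):
    """Power efficiency of a Table 1 test based on r subgroups."""
    return s / r


# Hypothetical illustration: a matched s of 9.8 against r = 14 subgroups.
efficiency = power_efficiency(9.8, 14)   # about 0.70
```

At δ = 0 the power reduces to the significance level α, which gives a quick sanity check on any implementation of (3).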

Table 3 contains a list of power function values for the tests of Table 1. These power
function values were taken from [4] and [?]. The power function values for the corresponding
tests based on (2) were obtained by use of (3) for fractional values of s. The values of s
given in Table 3 were used to compute the approximate power efficiencies listed in Table 1.
Although only a special case was considered, the power efficiency investigation of this
section would seem to indicate that the tests presented in this paper are sufficiently
efficient to be of practical value. No confidence interval efficiency investigations will be made.
However, the close relationship between a confidence interval and a test based on this interval
indicates that the confidence intervals considered in this paper will have reasonably high
efficiencies.

Practical Application. Now let us consider a procedure for applying the method of this
paper to an actual insurance investigation. Here the problem is to compute the values of the
q_1'', ..., q_r'' in such a manner that the usual point estimate for the rate of mortality is
obtained in the same operation. (The value of this point estimate equals the total number of
units paid divided by the total number of units exposed to risk.)

In addition to the usual information listed for the investigation, the first letter of
the last name of the person insured must be recorded for each policy. Then the totality of
units is divided into r subgroups on the alphabetical basis previously mentioned. Separately,
for each of these r subgroups, the units exposed to risk and the "deaths" (i.e., units paid)
are obtained in the usual manner (see, e.g., [?]). The ratio of "deaths" to exposed to risk
for each subgroup yields the values of q_1'', ..., q_r''. To obtain the usual point estimate for
the mortality rate, the totality of "deaths" for all subgroups is divided by the sum of the
exposed to risk for all subgroups.
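
The single pass described above can be sketched as follows, assuming the exposed to risk and the "deaths" have already been tallied per subgroup. The function name and the two-subgroup figures are hypothetical.

```python
def rates_and_point_estimate(exposed, paid):
    """Return the subgroup rates q_1'', ..., q_r'' and the usual pooled point estimate."""
    rates = [d / e for d, e in zip(paid, exposed)]
    pooled = sum(paid) / sum(exposed)   # total units paid / total units exposed to risk
    return rates, pooled


# Hypothetical tallies for two subgroups (units exposed to risk, units paid).
rates, pooled = rates_and_point_estimate(exposed=[1000, 2000], paid=[10, 30])
```

The pooled estimate is the exposure-weighted average of the subgroup rates, so it coincides with their simple average only when the subgroups carry equal exposure.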

The method proposed differs from the ordinary method of obtaining the point estimate for
the rate of mortality in two respects. First, there is the additional step of recording the
first letter of the last name of each person in the investigation. Second, the exposed to
risk and "deaths" are obtained separately for subgroups rather than for the combined data.
If the procedure of recording the first letter of the last name is instituted at the initial
stage of the investigation, it would seem that little extra effort is required for this
recording. If the investigation is large and punched card equipment is used, sorting the
units into the required subgroups and computing the exposed to risk and "deaths" separately
for each subgroup would not appear to necessitate substantially more work than is required
to obtain the point estimate. Thus the method of this paper is easily applied from a
computational viewpoint if the alphabetical information is recorded when the other information
for the investigation is obtained.

According to present practice, however, no alphabetical information is recorded for
mortality investigations. Thus obtaining this extra data for studies already begun would
require much additional work. For studies not yet begun, however, recording the alphabetical
information would require little additional effort.

With respect to computation of the exposed to risk and the "deaths", the formulas used
are applied to situations where some of the people are only exposed for a fraction of the
specified time interval. This would not appear to invalidate the considerations of the
previous sections. The same holds for the approximations used in computing the number of units
exposed to risk.

Appendix. The tests and confidence intervals for the rate of mortality presented in this
paper are special examples of some general asymptotic results. This section contains an
outline of these results.


Let Σ n_i independent observations be drawn from populations satisfying the conditions

(i). The first three moments of each population are finite.

(ii). There exist two fixed positive numbers such that the value of the variance of each
population lies between these numbers.

It is to be emphasized that no two observations are necessarily drawn from the same population
and that no population is necessarily continuous. These Σ n_i observations are drawn as k sets
of n_1, ..., n_k observations, respectively. Form the means y_1, ..., y_k of these k sets. Let
E(y_i) = μ_i(n_i) (i = 1, ..., k), and consider

$$\sqrt{n_1}\,\left[y_1 - \mu_1(n_1)\right],\ \ldots,\ \sqrt{n_k}\,\left[y_k - \mu_k(n_k)\right].$$

Let min n_i → ∞. Then, by the Central Limit Theorem, in the limit k independent observations
are obtained which are from continuous symmetrical populations with zero medians (in fact,
asymptotically each observation has a normal distribution with zero mean but unknown finite
non-zero variance).

The above asymptotic result shows that y_1, ..., y_k are independent observations from
populations which are very nearly continuous and symmetrical with medians μ_1(n_1), ..., μ_k(n_k),
respectively, if min n_i is sufficiently large. Thus, if min n_i is large and μ_1(n_1) = ... =
μ_k(n_k) = μ(n_1, ..., n_k), it is frequently permissible to obtain tests and confidence intervals
for μ(n_1, ..., n_k) by applying the results of [3] and [4] to y_1, ..., y_k.

It is to be noted that conditions (i) and (ii) are not very restrictive from a practical
viewpoint. Nearly all populations approximated in practice satisfy condition (i). Also,
populations with arbitrarily small (near zero) or large variances are seldom, if ever,
approximated in practical situations.
TABLE 1

SOME ONE-SIDED AND SYMMETRICAL TESTS FOR 10 ≤ r ≤ 15

Symmetrical test: accept q ≠ q_0 if either one-sided relation holds.

r  | One-sided level | Symmetrical level | One-sided: accept q < q_0 if             | One-sided: accept q > q_0 if                     | Approximate efficiency
10 | 5.6%            | [illegible]       | max[x_6, [illegible]] < q_0              | [illegible]                                      | 73%
10 | [illegible]     | 5.1%              | max[[illegible], ½(x_5 + x_10)] < q_0    | [illegible]                                      | 72%
10 | [illegible]     | 2.1%              | max[x_8, ½(x_6 + x_10)] < q_0            | min[x_3, ½(x_1 + x_5)] > q_0                     | 67%
10 | 0.5%            | 1.0%              | [illegible]                              | min[[illegible], ½(x_1 + [illegible])] > q_0     | 59%
11 | 2.8%            | [illegible]       | max[x_7, ½(x_5 + x_11)] < q_0            | min[x_5, ½(x_1 + x_7)] > q_0                     | 70%
11 | 0.5%            | 1.1%              | max[x_9, ½([illegible] + x_11)] < q_0    | [illegible]                                      | 64.5%
12 | [illegible]     | 2.0%              | max[[illegible], ½(x_6 + [illegible])] < q_0 | [illegible]                                  | 69%
13 | 0.5%            | 1.0%              | max[x_10, ½(x_7 + x_13)] < q_0           | min[x_4, ½(x_1 + x_7)] > q_0                     | 67%
14 | 1.0%            | 2.0%              | max[x_10, ½(x_6 + x_14)] < q_0           | min[x_5, ½(x_1 + x_9)] > q_0                     | 69%
15 | 1.0%            | [illegible]       | max[x_11, ½([illegible] + x_15)] < q_0   | [illegible]                                      | 67.5%
TABLE  2 
TABLE 3

Column headings: Significance Test; Sample Size; Significance Level; Values of Power Function (by value of δ).

Rows: each one-sided test of q < q_0 from Table 1, followed by the corresponding test based on (2).


